[vpj] [controller] Fix categorization of various errors in VPJ #2308
Problem Statement
A few push errors in VPJ are currently not categorized as user errors, so they affect the SLA calculation for push job failures. We need to categorize these errors appropriately to improve the accuracy of push job failures as part of SLA measurement.
In this PR we address 3 issues that are mis-categorized:
- `key` or `value` schema mismatch between server and input data
- `key` or `value` which is not present in input data and results in `VeniceSchemaFieldNotFoundException`
Solution
The solution is to handle these errors appropriately in `VenicePushJob` and invoke the correct `PushJobCheckpoint` for each of them.
Note: We can follow up with a PR that moves the responsibility for checkpointing to the exceptions themselves. That would consolidate the categorization and avoid scattered error handling, which is prone to missing scenarios and edge cases.
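As a rough illustration of the approach described above, the sketch below maps caught exceptions to a checkpoint at a single point in the push flow. It is not the code in this PR: the `Checkpoint` enum, its constant names, the three exception classes, and the `runPush` and `categorize` helpers are all hypothetical stand-ins invented for this example, and the real `VenicePushJob` and checkpoint types may look quite different.

```java
public class PushErrorCategorizationSketch {
  /** Hypothetical stand-in for the checkpoints a push job reports; names are illustrative only. */
  enum Checkpoint {
    SUCCESS,
    SCHEMA_MISMATCH_USER_ERROR,
    FIELD_NOT_FOUND_USER_ERROR,
    ACL_USER_ERROR,
    UNCATEGORIZED_FAILURE
  }

  /** Hypothetical exception: key/value schema differs between server and input data. */
  static class SchemaMismatchException extends RuntimeException {
    SchemaMismatchException(String message) {
      super(message);
    }
  }

  /** Hypothetical exception: configured key/value field is absent from the input data. */
  static class FieldNotFoundException extends RuntimeException {
    FieldNotFoundException(String message) {
      super(message);
    }
  }

  /** Hypothetical exception: the caller lacks permission to create a version. */
  static class AclDeniedException extends RuntimeException {
    AclDeniedException(String message) {
      super(message);
    }
  }

  /** Maps a failure to a checkpoint; anything unrecognized stays uncategorized. */
  static Checkpoint categorize(Throwable failure) {
    if (failure instanceof SchemaMismatchException) {
      return Checkpoint.SCHEMA_MISMATCH_USER_ERROR;
    }
    if (failure instanceof FieldNotFoundException) {
      return Checkpoint.FIELD_NOT_FOUND_USER_ERROR;
    }
    if (failure instanceof AclDeniedException) {
      return Checkpoint.ACL_USER_ERROR;
    }
    return Checkpoint.UNCATEGORIZED_FAILURE;
  }

  /** Runs one push step and returns the checkpoint that should be recorded for it. */
  static Checkpoint runPush(Runnable pushStep) {
    try {
      pushStep.run();
      return Checkpoint.SUCCESS;
    } catch (RuntimeException e) {
      return categorize(e);
    }
  }

  public static void main(String[] args) {
    Checkpoint result = runPush(() -> {
      throw new FieldNotFoundException("field 'value' missing from input");
    });
    System.out.println("Recorded checkpoint: " + result);
  }
}
```

Keeping the mapping in one `categorize` method is only one possible layout. The follow-up mentioned in the note would instead let each exception type carry its own checkpoint, which removes the `instanceof` chain.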
Code changes
- `createVersion` code path to return new `errorType` for ACL issues: `ACL_ERROR`
How was this PR tested?
Will test E2E in the LinkedIn setup to ensure the checkpoints are reflected.
Does this PR introduce any user-facing or breaking changes?